UVM Discard: Eliminating Redundant Memory Transfers for Accelerators

https://doi.org/10.1109/IISWC55918.2022.00013

Zhu, Weixi; Cox, Guilherme; Vesely, Jan; Hairgrove, Mark; Cox, Alan L.; Rixner, Scott (November 2022, 2022 IEEE International Symposium on Workload Characterization (IISWC))

An increasing number of applications benefit from heterogeneous hardware accelerators. Such accelerators often require the application to manually manage memory buffers on devices and transfer data between host and device buffers. A programming model that unifies the virtual address space across the host and devices is appealing because it enables automatic memory transfers and simplifies application-level programming. However, the automatic memory transfers can sometimes be redundant, which decreases performance. NVIDIA’s UVM (unified virtual memory) driver provides a unified virtual address space for CPU-GPU programming. This paper identifies redundant memory transfers (RMTs) as a common performance issue with UVM. To address this issue, this paper proposes a data discard directive, and evaluates two implementations of that directive, UvmDiscard and UvmDiscardLazy. This directive exploits application-level knowledge to avoid RMTs. The implementations were integrated with NVIDIA’s open-source UVM driver to demonstrate their usefulness on real-world CUDA UVM applications. For example, the use of the discard directive increases training throughput by 61.2% on a large deep learning application that oversubscribes GPU memory.
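The effect the abstract describes can be illustrated with a toy model. The sketch below is not the paper's UvmDiscard or UvmDiscardLazy implementation and does not use the UVM driver's API. It assumes a simplified demand-paged GPU memory with LRU eviction, in which every page initially holds live data on the host, and it counts page copies in both directions. All names (`ToyUnifiedMemory`, `run_workload`) and the workload parameters are hypothetical.

```python
from collections import OrderedDict


class ToyUnifiedMemory:
    """Toy model of demand-paged GPU memory (an assumption, not UVM's real logic)."""

    def __init__(self, gpu_capacity_pages):
        self.capacity = gpu_capacity_pages
        self.resident = OrderedDict()  # pages currently on the GPU, in LRU order
        self.discarded = set()         # pages whose contents the app declared dead
        self.transfers = 0             # pages copied in either direction

    def _evict_one(self):
        page, _ = self.resident.popitem(last=False)  # least recently used page
        if page not in self.discarded:
            self.transfers += 1        # live data must be copied back to the host

    def gpu_access(self, page):
        if page in self.resident:
            self.resident.move_to_end(page)
        else:
            if len(self.resident) >= self.capacity:
                self._evict_one()
            if page not in self.discarded:
                self.transfers += 1    # live data must be copied to the GPU
            self.resident[page] = True
        self.discarded.discard(page)   # touching a page makes its contents live again

    def discard(self, page):
        """Application-level hint: the current contents of this page are not needed."""
        self.discarded.add(page)


def run_workload(use_discard, capacity=4, scratch_pages=8, iterations=3):
    """Reuse a scratch buffer larger than GPU memory across several iterations."""
    mem = ToyUnifiedMemory(capacity)
    for _ in range(iterations):
        for page in range(scratch_pages):
            mem.gpu_access(page)
        if use_discard:
            # The scratch contents are dead once the iteration ends.
            for page in range(scratch_pages):
                mem.discard(page)
    return mem.transfers


if __name__ == "__main__":
    print("transfers without discard:", run_workload(use_discard=False))
    print("transfers with discard:   ", run_workload(use_discard=True))
```

In this model, the scratch buffer oversubscribes GPU memory, so without the hint every reuse both evicts a page and copies one in. With the hint, dead pages are dropped on eviction and re-created on access without a copy, which is the kind of redundant transfer the directive is meant to avoid. The counts say nothing about the real driver's behavior or the reported 61.2% throughput gain.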
